6 research outputs found

    Chameleon: Virtualizing Idle Acceleration Cores of A Heterogeneous Multi-Core Processor for Caching and Prefetching

    Heterogeneous multi-core processors have emerged as an energy- and area-efficient architectural solution for improving the performance of domain-specific applications, such as those with abundant data-level parallelism. These processors typically contain a large number of small, compute-centric acceleration cores while keeping one or two high-performance ILP cores on the die to guarantee single-thread performance. Although a major portion of the transistors is devoted to the acceleration cores, these resources sit idle when running unparallelized legacy code or the sequential parts of an application. To address this under-utilization, in this paper we introduce Chameleon, a flexible heterogeneous multi-core architecture that virtualizes these idle resources to enhance memory performance when running sequential programs. The Chameleon architecture can dynamically virtualize the idle acceleration cores into a last-level cache, a data prefetcher, or a hybrid of the two. In addition, Chameleon can operate in an adaptive mode that dynamically reconfigures the acceleration cores between the hybrid mode and the prefetch-only mode by monitoring the effectiveness of the Chameleon caching scheme. In our evaluation with the SPEC2006 benchmark suite, the different modes yield different levels of performance improvement across applications. In the adaptive mode, Chameleon improves the performance of SPECint06 and SPECfp06 by 33% and 22% on average; considering only memory-intensive applications, the improvements are 53% and 33%, respectively.
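    The adaptive mode described above can be pictured as a simple feedback controller. The sketch below is illustrative only: the counters, the 1-in-16 threshold, and the sampling interval are assumptions for the sake of the example, not details taken from the paper.

        /* Minimal sketch of an adaptive controller in the spirit of Chameleon's
         * adaptive mode (assumed details). The paper only states that this mode
         * monitors the effectiveness of the caching scheme and switches between
         * the hybrid and prefetch-only modes; everything concrete here is an
         * illustrative assumption. */
        #include <stdint.h>

        enum chameleon_mode { MODE_HYBRID, MODE_PREFETCH_ONLY };

        struct chameleon_ctrl {
            uint64_t vcache_hits;     /* hits in the virtualized last-level cache   */
            uint64_t vcache_lookups;  /* total lookups during the sampling interval */
            enum chameleon_mode mode;
        };

        /* Called once per sampling interval (e.g., every N committed instructions). */
        static void chameleon_adapt(struct chameleon_ctrl *c)
        {
            /* Hypothetical effectiveness threshold: if fewer than 1 in 16 lookups
             * hit the virtualized cache, its capacity is better spent prefetching. */
            int effective = c->vcache_lookups > 0 &&
                            (c->vcache_hits * 16 >= c->vcache_lookups);

            c->mode = effective ? MODE_HYBRID : MODE_PREFETCH_ONLY;

            /* Reset the counters for the next interval. */
            c->vcache_hits = 0;
            c->vcache_lookups = 0;
        }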

    Investigating a SoftCache via Dynamic Rewriting

    Software caching via binary rewriting enables networked embedded devices to enjoy the benefits of a memory hierarchy without the hardware costs. A software cache replaces the hardware cache/MMU mechanisms of the embedded system with software management of on-chip RAM, using a network server as the backing store. The bulk of the software complexity is placed on the server, so that the embedded system holds only the application's current working set and a small runtime system invoked on cache misses. We present a design and implementation of instruction caching on an ARM-based embedded system with a separate server and detail the issues we discovered. We show that the software cache succeeds at discovering the small working set of several test applications, reducing resident code by 7X to 14X relative to the original application code. Further, we show that our software overhead remains small for typical functions in embedded applications. Finally, we discuss the implications of software caching for dynamic optimization, power savings, and hardware design.
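    The client-side miss path described above can be sketched roughly as follows. The function names, staging buffer, and wire protocol are assumptions made for illustration and are not the paper's actual runtime interface.

        /* Rough sketch of a SoftCache-style instruction-cache miss handler
         * (assumed API). Rewritten call sites trap into this runtime when the
         * target function is not resident in on-chip RAM; the server supplies
         * the missing code over the network. */
        #include <stddef.h>
        #include <stdint.h>

        void  *cache_lookup(uint32_t func_id);                        /* resident copy, or NULL */
        void  *cache_insert(uint32_t func_id, const void *code, size_t len); /* may evict */
        size_t server_fetch(uint32_t func_id, void *buf, size_t max); /* network request        */

        /* Invoked by rewritten call sites when the target is not in on-chip RAM. */
        void *softcache_miss(uint32_t func_id)
        {
            static uint8_t xfer_buf[4096];            /* staging buffer for fetched code */

            void *code = cache_lookup(func_id);
            if (code != NULL)
                return code;                          /* already loaded by an earlier miss */

            size_t len = server_fetch(func_id, xfer_buf, sizeof xfer_buf);
            return cache_insert(func_id, xfer_buf, len);
        }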

    Software Caching using Dynamic Binary Rewriting for Embedded Devices

    A software cache implements instruction and data caching entirely in software. Dynamic binary rewriting offers a means to specialize the software cache miss checks at cache miss time. We describe a software cache system implemented using dynamic binary rewriting and observe that the combination is particularly appropriate for the scenario of a simple embedded system connected to a more powerful server over a network; as two examples, consider a network of sensors with local processing, or cell phones connected to cell towers. We describe two software cache systems that implement instruction caching only, using dynamic binary rewriting, and present results for the performance of instruction caching in these systems. We measure time overheads of 19% compared to no caching. We also show that we can guarantee a 100% hit rate for code that fits in the cache. For comparison, we estimate that a comparable hardware cache would have a space overhead of 12-18% for its tag array and would offer no hit-rate guarantee.
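    The quoted 12-18% tag-array overhead is consistent with a back-of-the-envelope estimate like the one below. The line size and per-line tag/valid/state bit counts are illustrative assumptions, not parameters taken from the paper.

        /* Rough estimate of hardware tag-array space overhead, assuming
         * 32-byte cache lines and roughly 32-48 bits of tag, valid, and
         * state metadata per line. Prints about 12.5% - 18.8%. */
        #include <stdio.h>

        int main(void)
        {
            const double line_bits     = 32 * 8;   /* 32-byte cache line            */
            const double tag_bits_low  = 32;       /* tag + valid bit               */
            const double tag_bits_high = 48;       /* tag + valid + extra state     */

            printf("tag overhead: %.1f%% - %.1f%%\n",
                   100.0 * tag_bits_low  / line_bits,
                   100.0 * tag_bits_high / line_bits);
            return 0;
        }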

    The elusive metric for low-power architecture research

    The energy-delay (ED) product was proposed as a metric for gauging design effectiveness. The metric is widely used in low-power architecture research; however, it is also often applied improperly when reporting a new architecture design that targets energy-performance effectiveness. In this paper, we discuss two common fallacies from the literature: (1) the way the ED product is calculated, and (2) issues that arise when introducing additional hardware structures to reduce dynamic switching activity. When the ED product is used without meticulous consideration, a seemingly energy-efficient design can turn out to be a more energy-consuming one.
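    The first fallacy can be illustrated with a toy calculation: judging a design by a partial energy number instead of total energy times total delay. All numbers below are made up for illustration and are not taken from the paper.

        /* Toy illustration of a miscalculated ED product. */
        #include <stdio.h>

        int main(void)
        {
            /* Baseline: normalized energy 1.0, delay 1.0 -> ED = 1.0 */
            double base_ed = 1.0 * 1.0;

            /* "Optimized" design: dynamic switching energy drops 20%, but the
             * added hardware structure consumes 15% extra energy and stretches
             * delay by 10%. */
            double dyn   = 0.80;   /* reduced dynamic energy         */
            double extra = 0.15;   /* energy of the added structure  */
            double delay = 1.10;   /* slower clock or more cycles    */

            double naive_ed = dyn * 1.0;             /* ignores extra energy and delay */
            double real_ed  = (dyn + extra) * delay; /* honest ED product              */

            printf("baseline ED=%.2f  naive ED=%.2f  actual ED=%.3f\n",
                   base_ed, naive_ed, real_ed);      /* actual 1.045 > baseline 1.0 */
            return 0;
        }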

    InfoShield: A Security Architecture for Protecting Information Usage in Memory

    Cyber theft is a serious threat to Internet security and one of the major security concerns of both network service providers and Internet users. Although sensitive information can be encrypted when stored in non-volatile memory such as hard disks, for many e-commerce and network applications it is often stored as plaintext in main memory. Documented and reported exploits enable an adversary to steal sensitive information from an application’s memory; these exploits include illegitimate memory scans, information-theft-oriented buffer overflows, invalid pointer manipulation, integer overflows, password-stealing trojans, and so forth. Today’s computing systems and their hardware cannot address these exploits effectively in a coherent way. This paper presents a unified and lightweight solution, called InfoShield, that strengthens application protection against theft of sensitive information such as passwords, encryption keys, and other private data with minimal performance impact. Unlike prior whole-memory encryption and information-flow-based efforts, InfoShield protects the usage of information: it ensures that sensitive data are used only as defined by application semantics, preventing misuse of information. Compared with prior art, InfoShield handles a broader range of information-theft scenarios in a unified framework with less overhead. Evaluation using popular network client-server applications shows that InfoShield is sound for practical use and incurs little performance loss because it protects only the absolutely critical sensitive information. Based on the profiling results, only 0.3% of memory accesses and 0.2% of executed code are affected by InfoShield.
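    A rough software analogy of the usage-restriction idea is sketched below. InfoShield itself enforces the policy in hardware; the structures, names, and code-range check here are illustrative assumptions, not the paper's actual mechanism.

        /* Sketch of a usage policy: a sensitive buffer is registered together
         * with the only code range permitted to read it, and any other access
         * to that region is rejected. Conceptually this check would run on
         * every load/store to the protected region. */
        #include <stdbool.h>
        #include <stddef.h>
        #include <stdint.h>

        struct usage_policy {
            const void *data;       /* start of the sensitive region             */
            size_t      len;        /* length of the sensitive region            */
            const void *reader_lo;  /* code range allowed to touch the region    */
            const void *reader_hi;
        };

        /* pc: address of the accessing instruction; addr: accessed data address. */
        static bool access_allowed(const struct usage_policy *p,
                                   const void *pc, const void *addr)
        {
            const uint8_t *a  = addr;
            const uint8_t *ip = pc;

            bool in_region = a >= (const uint8_t *)p->data &&
                             a <  (const uint8_t *)p->data + p->len;
            if (!in_region)
                return true;                      /* not sensitive data          */

            return ip >= (const uint8_t *)p->reader_lo &&
                   ip <  (const uint8_t *)p->reader_hi; /* only the declared reader */
        }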